EXECUTIVE SUMMARY


Research  on  when  and  how  to  use  three-dimensional  (3-D)  perspective  views  on  flat  screens  for 
military  operational  tasks  such  as  air  traffic  control  is  confusing  and  contradictory.  Considering  the 
basic  qualities  and  capabilities  of  two-dimensional  (2-D)  and  3-D  views,  we  conducted  two 
experiments.  Because  perspective  views  integrate  all  the  dimensions,  we  hypothesized  that  3-D 
views  are  better  for  object  understanding.  In  contrast,  we  hypothesized  that  2-D  views  are  better  for 
judging  the  relative  position  of  objects  because  each  dimension  can  be  isolated.  Participants  viewed 
simple  block  shapes  in  2-D  or  3-D  and  either  performed  an  object  understanding  task  (e.g., 
identification,  mental  rotation)  or  a  relative  position  task  (e.g.,  directions  and  distances  between 
objects).  We  found  that  a  3-D  perspective  view  was  far  superior  to  2-D  views  for  understanding  the 
shape  of  the  simple  blocks,  but  2-D  views  were  better  than  3-D  views  for  comprehending  the  relative 
position  of  two  objects. 

INTRODUCTION


Many  military  operational  tasks  require  the  comprehension  of  three-dimensional  (3-D)  objects  and 
environments.  For  example,  the  perception  and  understanding  of  a  3-D  airspace  are  required  for  air 
route  planning,  fly  and  no-fly  zones,  terrain  visualization,  enemy  targets,  and  enemy  radar  and  air 
defense  zones.  Surface  warfare  officers  have  suggested  that  the  display  of  tactical  information  in 
3-D  would  aid  in  assessing  the  force  structure  of  friends,  neutrals,  possible  adversaries,  and 
noncombatants  (Kribs,  Eddy,  and  Cowen,  1999).  Consoles  that  display  data  in  3-D  seem  to  provide  a 
natural,  and  increasingly  affordable,  solution  to  these  requirements. 

Various  3-D  display  technologies  are  either  available  or  under  development.  Some  technologies 
involve  the  true  representation  of  three  dimensions  in  a  holographic  image  (e.g.,  Lucente,  1997)  or  a 
volumetric  display  (e.g.,  Soltan  et  al.,  1998).  Most  3-D  technologies,  however,  display  a  3-D 
perspective view onto a flat surface such as a CRT or LCD panel. The image is two-dimensional (2-D),
but the viewing angle provides a 3-D perspective. For example, rather than displaying an
environment  from  directly  above  (a  planar  or  “bird’s  eye”  view),  perspective  view  technologies 
generally  display  the  environment  from  a  30-  or  45-degree  angle.  The  user  may  also  change  the 
perspective  to  view  any  desired  angle  (Dennehy,  Nesbitt  and  Sumey,  1994)  or  to  view  the  image 
stereoscopically.  However,  most  research  evaluating  3-D  displays  has  focused  on  performance  when 
using  stationary,  monocular  perspective  views. 

Many  potential  users  who  have  viewed  3-D  displays  are  positive  about  the  displays’  performance 
and use. However, Andre and Wickens (1995) caution system designers that sometimes “users want
what’s  not  best  for  them”  and  prefer  to  use  systems  that  hinder  rather  than  enhance  their  performance. 
Their  review  of  studies  on  input  devices,  display  interfaces,  and  color  and  3-D  rendered  on  flat 
screens  provides  evidence  to  support  this  hypothesis. 

While  it  may  be  naively  believed  that  more  dimensions  are  always  better,  the  evidence  is  decidedly 
mixed.  Across  an  array  of  tasks,  numerous  studies  have  found  benefits  for  3-D  perspective  over  2-D 
(Ellis,  McGreevy,  and  Hitchcock,  1987;  Bemis,  Leeds,  and  Winer,  1988;  Burnett  and  Barfield,  1991; 
Wickens  and  Prevett,  1995;  VanBreda  and  Veltman,  1998;  Andre  et  al.,  1991;  Haskell  and  Wickens, 
1993).  Other  studies  have  found  rough  parity  (Wickens  and  May,  1994;  Wickens  et  al.,  1996),  and 
still  other  studies  have  found  2-D  superior  to  3-D  (Boyer  and  Wickens,  1994;  Wickens  et  al.,  1995; 
Boyer,  et  al.,  1995;  O’Brien  and  Wickens,  1997).  The  details  of  tasks  and  interfaces  vary  widely,  and 
it  seems  likely  that  some  results  depend  more  on  these  details  than  on  the  nature  of  the  displays 
themselves  (e.g.,  Baumann,  Blanksteen,  and  Dennehy,  1997). 

Haskell and Wickens (1993, pp. 104-105) propose that 3-D perspective view displays lead to better
performance  whenever  the  tasks  to  be  performed  using  the  display  are  integrated  three-dimensionally 
or  whenever  the  method  of  performing  the  task  with  the  display  bears  a  strong  resemblance  to  a 
similar  task  performed  without  a  display.  For  flight  displays,  this  includes  flight  control  and 
identifying  and  making  integrated  judgments  regarding  other  aircraft.  In  these  cases,  the  similarity  of 
the  display  representation  to  the  view  in  visual-contact  flight  overcomes  possible  disadvantages  of 
the  3-D  format.  However,  for  tasks  that  require  focused  attention  and  that  do  not  have  a  visual  analog 
in  flight,  it  may  be  advantageous  to  create  separate  planar  displays. 

Unfortunately, this theory fails to resolve the confusion over 2-D and 3-D display use because it is so
difficult  to  predict  which  tasks  require  “focused  attention”  and  which  tasks  “require  integration 
across  dimensions”  (Haskell  and  Wickens,  1993,  p.  90). 

Despite  the  complex  and  confusing  nature  of  current  results,  if  we  are  to  influence  designs  for  the 
better, the time to do so is now. Industry is bringing advanced graphics to the marketplace for
entry-level personal computers, but these systems are designed without sufficient regard for usability.
Commercial  display  vendors  and  software  application  designers  agree  that  the  window  of 
opportunity  for  human-computer  interaction  (HCI)  design  guidance  for  3-D  displays  is  closing 
quickly  (e.g.,  Brandenburg,  1996). 

What  are  the  benefits  and  liabilities  of  3-D  versus  2-D  displays?  We  believe  that  the  main 
advantage  of  3-D  perspective  views  is  the  capability  to  easily  convey  the  shape  of  complex  objects 
such  as  molecules.  The  appeal  of  3-D  displays  may  well  stem  from  this  capability.  The  main 
disadvantage  of  3-D  perspective  views  seems  to  be  that  the  ambiguity  and  distortions  associated  with 
foreshortened  angles  and  distances  make  precise  judgments  of  distance  and  relative  position  difficult. 
Often,  natural  depth  cues  are  available  in  a  scene  that  can  be  used  to  compensate  for  the  effects  of 
foreshortening.  When  these  cues  are  unavailable,  however,  the  amount  of  distortion  and  ambiguity 
can  be  serious.  For  instance,  in  air  traffic  control,  the  aircraft  are  far  away  and  small,  so  few  depth 
cues  are  available,  and  it  is  quite  difficult  to  determine  their  distances  and  relative  positions.  Which 
aircraft  is  furthest  away?  Which  is  highest?  Is  that  balloon  in  the  flight  path? 

To  counteract  problems  associated  with  foreshortening,  many  display  engineers  have  experimented 
with  adding  artificial  depth  cues  to  3-D  perspective  views.  For  example,  a  common  artificial  cue  is  a 
“shadow”  that  lies  on  the  ground  directly  underneath  an  object,  such  as  an  airplane.  The  distance 
between  the  object  and  its  shadow  conveys  altitude,  and  the  location  of  the  shadow  on  the  ground 
conveys  position.  One  problem  with  shadows  is  that  for  aircraft  at  high  altitudes,  the  shadow  on  the 
ground  is  far  away  from  the  aircraft  and  may  appear  disconnected.  Another  problem  is  that  shadows 
double  the  number  of  objects  that  must  be  displayed,  which  adds  clutter  to  the  display. 

While  previous  research  has  tended  to  concentrate  on  specific  design  features  and  particular 
applications,  we  believe  that  we  may  gain  insight  into  the  controversy  surrounding  3-D  technologies 
by  focusing  on  the  fundamentals  of  visual  perception.  We  need  to  understand  how  the  view  of 
objects  impacts  different  cognitive  tasks.  In  this  report,  we  propose  a  new  theory  of  how  objects  are 
perceived  on  perspective  view  displays,  the  fundamental  limitations  of  these  displays,  and  how  and 
when  these  limitations  impact  different  cognitive/perceptual  tasks. 

Based  on  our  observations  of  3-D  advantages  and  disadvantages,  we  propose  a  hypothesis  to 
predict  when  3-D  perspective  views  would  benefit  or  harm  performance.  A  3-D  view  is  useful  for 
understanding  the  general  shape  of  complex  3-D  objects  because  it  integrates  the  three  dimensions 
into  a  single  view  and  provides  natural  depth  cues  such  as  perspective,  shading,  and  occlusion. 
Three-dimensional  views,  however,  impair  our  perception  of  the  relative  position  of  objects  because 
of  the  ambiguity  and  distortions  associated  with  foreshortened  angles  and  distances.  We  performed 
two  experiments  to  test  this  hypothesis. 

EXPERIMENT 1: OBJECT UNDERSTANDING


In  Experiment  1,  we  tested  the  hypothesis  that  a  3-D  perspective  view  leads  to  better  object 
understanding  than  a  2-D  view.  Our  goal  was  to  make  the  stimuli  simple  and  generic  in  the  hope  that 
our  results  would  apply  to  a  wide  variety  of  tasks  and  content  domains.  Consequently,  we  created 
simple  3-D  block  shapes  composed  of  10  to  16  cubes.  These  block  shapes  were  rendered  as  a  3-D 
perspective  view  or  as  a  set  of  2-D  views  (see  figure  1). 


Figure 1. 3-D rendering of a typical block shape used in the experiments (left side)
and 2-D rendering of the same object (right side).


“Object  understanding”  was  defined  by  four  different  identification-type  tasks:  (1)  Identify  3-D, 
(2)  Identify  Real,  (3)  Identify  Rotate-Yaw,  and  (4)  Identify  Rotate-Pitch.  In  the  Identify-3-D  task, 
participants  were  required  to  study  one  object,  rendered  in  either  2-D  or  3-D,  and  identify  that  same 
object  from  among  a  set  of  slightly  different  alternatives  rendered  in  3-D  (see  figure  2). 


Figure 2. Three answer choices for the Identify-3-D task for the block shown in figure 1.

The  3-D  renderings  in  the  multiple-choice  answer  set  provided  an  unfair  advantage  to  the  3-D 
condition  over  the  2-D  condition  because  in  the  3-D  condition,  the  correct  answer  looked  exactly  the 
same  as  the  study  block.  To  address  this  issue,  we  created  a  second  task,  the  Identify-Real  task,  in 
which  the  answer  sets  were  composed  of  wooden  blocks.  Participants  still  studied  either  2-D  or  3-D 
renderings,  but  they  picked  the  correct  answer  from  among  three  real  blocks. 

The third object understanding task was Identify Rotate-Yaw. This task required participants to
study  one  block,  rendered  in  either  2-D  or  3-D,  mentally  rotate  the  block  90  degrees  around  the 
vertical  axis  and  then  identify  the  correct  rotated  block  from  among  a  set  of  slightly  different 
alternatives  rendered  in  3-D  (see  figure  3).  The  fourth  task  was  Rotate-Pitch.  This  task  required 
participants  to  mentally  rotate  the  study  block  90  degrees  around  the  horizontal  axis  and  then  identify 
the  correct  block  from  among  three  slightly  different  alternatives. 


Figure 3. Three answer choices for the Rotate-Yaw task for the block shown in figure 1.


Mental  rotation  of  an  object  requires  “object  understanding”  because  all  of  the  object’s  parts  must 
be  coordinated  in  the  rotation.  However,  mental  rotation  is  an  inherently  harder  task  than 
identification.  Together,  the  relatively  easy  identification  tasks  and  the  relatively  hard  mental  rotation 
tasks  provided  a  good  range  of  object  understanding  tasks. 

METHOD 

Participants 

The  participants  were  32  Navy  and  civilian  personnel  employed  at  the  Fleet  Anti-Submarine 
Warfare  Training  Center,  San  Diego,  California. 

Stimuli 

The stimuli were 2-D and 3-D renderings of 10 simple block shapes created using a commercial
off-the-shelf  (COTS)  graphical  package  and  presented  on  a  15-inch  liquid  crystal  display  (LCD) 
panel.  Each  object  was  composed  of  10  to  16  cubes  arranged  into  a  3-D  shape.  Recognizable  shapes 
were  carefully  avoided.  For  the  3-D  renderings,  a  camera  was  positioned  at  30  degrees  above  the 
horizontal  plane  of  the  object  and  at  such  an  angle  that  three  faces  of  the  object  were  visible.  In 
choosing  the  viewing  angle,  we  carefully  ensured  that  all  prominent  features  of  the  object  were 
visible.  An  omni-light  and  an  ambient  light  illuminated  the  objects,  and  a  single  spotlight  source 
above  and  90  degrees  to  the  right  of  the  camera  created  shading.  Shadows  on  the  ground  were  not 
rendered. An orthographic projection, rather than a vanishing-point perspective, was used.
Consequently,  the  objects  appeared  small  and  close  to  the  participant.  Figure  1  shows  a  3-D  block 
shape  example.  For  the  2-D  renderings,  a  top  view,  front  view,  and  right  side  view  of  each  object 
were created. The three views were labeled and arranged as shown in figure 1.
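
As an illustration of how the three 2-D views relate to the 3-D shape, the following Python sketch treats a block as a set of unit-cube coordinates and forms each view by discarding one coordinate. It is not the rendering package used in the study; the axis convention (x to the right, y into the screen, z up), the function names, and the four-cube example block are assumptions made only for this example.

```python
from typing import Iterable, Set, Tuple

Cube = Tuple[int, int, int]   # (x, y, z); assumed: x = right, y = depth, z = up
Cell = Tuple[int, int]


def top_view(cubes: Iterable[Cube]) -> Set[Cell]:
    """Cells occupied when looking straight down (height z is discarded)."""
    return {(x, y) for x, y, z in cubes}


def front_view(cubes: Iterable[Cube]) -> Set[Cell]:
    """Cells occupied when looking from the front (depth y is discarded)."""
    return {(x, z) for x, y, z in cubes}


def right_view(cubes: Iterable[Cube]) -> Set[Cell]:
    """Cells occupied when looking from the right side (x is discarded)."""
    return {(y, z) for x, y, z in cubes}


# Hypothetical four-cube block: an L on the ground with one cube stacked on top.
block = [(0, 0, 0), (1, 0, 0), (1, 1, 0), (1, 1, 1)]

for name, view in (("top", top_view), ("front", front_view), ("right", right_view)):
    print(name, sorted(view(block)))
```

Each view isolates two dimensions and discards the third, so the viewer has to reassemble the shape mentally from the three views.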

Procedure 

For  the  Identify-3-D,  Rotate-Yaw,  and  Rotate-Pitch  tasks,  a  trial  consisted  of  first  viewing  a  single 
study  object  rendered  in  either  2-D  or  3-D  in  the  upper  half  of  the  screen.  After  10  seconds,  three 
multiple-choice  answer  objects  appeared  in  the  lower  half  of  the  screen  while  the  study  object 
remained visible. The participant used a mouse to choose an answer object. Reaction times to
complete  each  trial  were  recorded  as  well  as  errors.  For  each  task,  there  were  three  practice  trials 
(using  simple  objects)  followed  by  10  test  trials. 

In  the  Identify-Real  task,  the  study  object  was  displayed  on  the  computer  screen  in  the  same  place 
as  the  other  tasks,  but  no  objects  were  shown  in  the  lower  half  of  the  screen.  Instead,  the  participant 
was  shown  wooden  blocks  on  a  table  to  the  left  of  the  computer  screen.  For  each  trial,  after  a  study 
object  was  shown  for  10  seconds,  a  tone  sounded  and  a  screen  was  lifted  to  reveal  the  wooden 
answer  blocks.  The  blocks  were  labeled  “a,”  “b,”  and  “c.”  Participants  chose  the  correct  block  by 
using  the  mouse  to  select  an  “a,”  “b,”  or  “c”  button  on  the  computer  screen. 

Thirty-two participants were randomly assigned to one of the four tasks: (1) Identify 3-D,
(2) Identify Real, (3) Identify Rotate-Yaw, and (4) Identify Rotate-Pitch. Each participant received
both the 2-D and 3-D conditions for his or her assigned task. Participants viewed each of the 10 stimuli
once per condition (two trials per stimulus in total). The order of conditions was counterbalanced across
participants. This experimental design allowed us to compare performance on the 2-D and 3-D
conditions of a task for each participant.
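
One way to realize such a design is sketched below in Python. The report states only that assignment to tasks was random and that condition order was counterbalanced; the alternating order scheme, the fixed seed, and the function name `assign_participants` are assumptions made for illustration.

```python
import random

TASKS = ["Identify 3-D", "Identify Real", "Identify Rotate-Yaw", "Identify Rotate-Pitch"]


def assign_participants(participant_ids, seed=0):
    """Randomly split participants evenly across tasks and alternate the
    2-D/3-D order within each task (assumed counterbalancing scheme)."""
    rng = random.Random(seed)
    ids = list(participant_ids)
    rng.shuffle(ids)
    per_task = len(ids) // len(TASKS)
    plan = {}
    for t, task in enumerate(TASKS):
        group = ids[t * per_task:(t + 1) * per_task]
        for i, pid in enumerate(group):
            order = ("2-D", "3-D") if i % 2 == 0 else ("3-D", "2-D")
            plan[pid] = (task, order)
    return plan


plan = assign_participants(range(32))
for pid in sorted(plan)[:4]:
    print(pid, plan[pid])
```

With 32 participants this gives 8 per task, 4 of whom see the 2-D condition first.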

Before  receiving  the  2-D  version  of  the  task,  participants  were  shown  a  video  loop  on  the  screen 
on  how  to  interpret  the  2-D  renderings.  The  video  showed  a  3-D  rendering  of  an  object  with  the  three 
2-D  views  wrapped  around  the  top,  front,  or  sides  of  the  object.  As  the  video  continued  to  play,  the 
three  views  unwrapped,  flattened  out,  and  moved  away  from  the  object  until  the  3-D  rendering  and 
the  2-D  rendering  were  side  by  side  (see  figure  4).  The  orientation  of  the  three  views  was  pointed  out 
carefully.  The  video  ran  forwards  and  backwards  for  as  long  as  the  participant  desired. 


Figure  4.  Three  frames  from  the  video  clip  shown  to  participants  to  explain  the  2-D  views. 

RESULTS 

For  each  participant,  response  times  (RT)  for  correct  trials  were  averaged,  and  a  percent  correct 
(PC) score was calculated. For each task (n = 8), the mean RT for the 2-D condition was compared to
the mean RT for the 3-D condition (see figure 5). The mean percent correct scores also were
compared. Participants were faster and more accurate with the 3-D views on the Identify-3-D task
(RT: t(7) = 7.25, p < .001; PC: t(7) = 4.58, p < .003) and the Identify-Real task (RT: t(7) = 3.52,
p < .01; PC: t(7) = 2.18, p < .07). Note that the mean differences between the 2-D and 3-D conditions are about
the same for both the wooden blocks and the 3-D graphic answer sets. This suggests that the benefits
found for the 3-D graphic views are not due to the graphical similarity between the 3-D study
stimuli and the 3-D answer stimuli, because the same benefits are found when the answer stimuli
are wooden blocks. Instead, the benefits must arise because the 3-D renderings are
easier to understand.

Participants were also faster and more accurate with the 3-D views on the Rotate-Yaw task
(RT: t(7) = 7.45, p < .001; PC: t(7) = 3.49, p < .01) and on the Rotate-Pitch task (RT: t(7) = 7.31,
p < .001; PC: t(7) = 2.43, p < .05).
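
The degrees of freedom reported here (7, with n = 8 per task) are consistent with a paired-samples t test on each participant's 2-D and 3-D scores, although the report does not name the test. A minimal Python sketch of that computation follows; the function name and the numbers in the usage lines are hypothetical and are not data from the experiment.

```python
import math
from statistics import mean, stdev


def paired_t(scores_a, scores_b):
    """Return (t, df) for a paired-samples t test on two equal-length lists.

    t = mean(d) / (sd(d) / sqrt(n)), where d is the per-participant
    difference a - b and sd is the sample standard deviation.
    """
    diffs = [a - b for a, b in zip(scores_a, scores_b)]
    n = len(diffs)
    t = mean(diffs) / (stdev(diffs) / math.sqrt(n))
    return t, n - 1


# Hypothetical mean response times (seconds) for 8 participants.
rt_2d = [30.1, 25.4, 28.0, 33.2, 27.5, 31.0, 26.8, 29.9]
rt_3d = [10.2, 9.8, 11.5, 12.0, 8.9, 10.7, 9.5, 11.1]

t, df = paired_t(rt_2d, rt_3d)
print(f"t({df}) = {t:.2f}")
```

Because each participant serves as his or her own control, the test operates on the within-participant differences rather than on the two group means separately.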

Figure 5. Mean response times (in seconds) and percent correct scores for the object
understanding tasks.


DISCUSSION 

The  3-D  view  is  better  for  understanding  the  shapes  of  simple  blocks.  Participants,  on  average, 
seem  to  be  about  three  times  slower  and  about  10  percent  less  accurate  on  shape  understanding  tasks 
for 2-D views compared to 3-D views. The obvious weakness of the 2-D views for any of the object
understanding tasks is that the top-down, front, and side views must be integrated into a single object,
which takes time. In the perspective drawings, by comparison, the views are already integrated as a
single  object.  Understanding  the  shape  of  the  object  is  uncomplicated  when  all  of  the  relevant  object 
features  are  visible  in  the  perspective  view. 

EXPERIMENT 2: RELATIVE POSITION


In Experiment 2, we investigated the effectiveness of 2-D and 3-D views for determining the
relative  position  of  two  objects.  According  to  our  hypothesis,  the  relative  position  of  two  objects 
should  be  easier  to  determine  with  2-D  views  than  3-D  views  because  the  normalized  viewing  angles 
(e.g.,  top-down,  side  view)  of  2-D  views  eliminate  the  foreshortening  distortions  found  in  3-D 
perspective  views.  “Relative  Position  Understanding”  was  defined  by  three  different  tasks:  (1)  the 
Over-Same  task,  (2)  the  Over-Different  task,  and  (3)  the  Navigation  task. 

In  the  Over-Same  task,  participants  were  presented  with  a  2-D  or  3-D  view  that  consisted  of  a 
simple block shape (cf. Experiment 1) and a ball the size of a single cube that was always located
somewhere  above  the  block.  The  participant’s  task  was  to  determine  which  cube  of  the  block  was 
directly  underneath  the  ball  and  to  click  on  that  cube  using  a  mouse.  From  a  single  view  in  the  3-D 
condition  (see  figure  6),  there  is  more  than  one  correct  location  because  the  height  and  distance  of  the 
ball  are  ambiguous.  This  ambiguity  derives  from  the  very  nature  of  the  3-D  viewing  angle:  it  cannot 
be  determined  whether  the  ball  is  high  up  and  toward  the  front  of  the  block  or  low  down  and  toward 
the  rear  of  the  block.  Consequently,  we  provided  a  second  view  of  the  block  and  ball  from  a  different 
viewing  angle  to  eliminate  this  ambiguity.  Participants  could  compare  the  views  to  determine  the 
height  of  the  ball  and  determine  which  cube  lay  underneath  the  ball. 


Figure  6.  3-D  version  of  a  typical  block  and  ball  used  in  the  relative  position  task.  Note  that  for  each 
individual  view,  the  location  of  the  ball  is  ambiguous,  but  by  comparing  the  two  views,  the  location  of 
the  ball  can  be  determined. 


In  the  2-D  condition  of  the  Over-Same  task,  the  top-down,  front,  and  side  views  of  the  block  and 
ball  were  presented  (see  figure  7).  The  2-D  condition  of  the  task  should  be  very  easy  to  perform 
because  participants  only  need  to  look  at  the  top  view  to  determine  which  cube  lies  directly 
underneath  the  ball.  In  fact,  if  you  click  on  the  ball,  you  will  click  on  the  correct  answer. 

One  might  argue  that  this  task  is  artificially  easy,  but  we  believe  that  it  is  simply  the  nature  of  2-D 
normalized  views,  which  make  relative  position  tasks  like  this  one  simple  and  obvious.  Nevertheless, 
we  developed  a  second  task,  the  Over-Different  task,  to  address  this  issue.  In  the  Over-Different  task, 
participants were presented with 2-D or 3-D views of the block and the ball as described in the
Over-Same task. Again, their task was to determine which cube was underneath the ball. This time,
however, a separate window showed a 3-D view of the block with the ball removed.
Participants used this 3-D view of the block alone to report their answer. In Experiment 1, we found
that a 3-D view was valuable in understanding object shape. Consequently, clicking on a cube in a
3-D view of the block was a better way to demonstrate understanding of the location of the ball. Thus,
in the 2-D condition of the task, participants could not simply find the ball in the top view and click
on it to answer. Instead, they had to click on the correct cube in the 3-D view.


Figure 7. 2-D views of the block shape and ball in figure 6.

In  the  third  task,  Navigation,  participants  determined  how  to  move  from  a  designated  cube  (shown 
in red) in the block shape to reach the ball. The Navigation task immediately followed the
Over-Same task for each stimulus trial. After the participant clicked on the cube underneath the ball,
compass points appeared on the screen to indicate North, South, East, West, Up, and Down.
Participants indicated the number of moves in “cubes” in each direction needed to get from the red
cube to the ball. In the example shown in figures 6 and 7, you must move 3 up, 2 east, and 1 north to
get from the red cube to the ball.
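
In coordinate terms, the Navigation answer is the per-axis difference between the ball and the red cube. The Python sketch below shows this under assumed conventions: x = east, y = north, z = up, integer cube coordinates, and answer labels in the style of the direction menus. The function names and the example coordinates are hypothetical, chosen only to reproduce the "3 up, 2 east, 1 north" example.

```python
def axis_label(delta, positive, negative):
    """Format a signed cube count as a menu-style label, e.g. 'North 2' or '0'."""
    if delta > 0:
        return f"{positive} {delta}"
    if delta < 0:
        return f"{negative} {-delta}"
    return "0"


def navigation_answer(red_cube, ball):
    """Moves needed to get from the red cube to the ball, one entry per axis."""
    dx, dy, dz = (b - r for r, b in zip(red_cube, ball))
    return {
        "North/South": axis_label(dy, "North", "South"),
        "East/West": axis_label(dx, "East", "West"),
        "Up/Down": axis_label(dz, "Up", "Down"),
    }


# Hypothetical coordinates (x = east, y = north, z = up).
print(navigation_answer(red_cube=(0, 0, 0), ball=(2, 1, 3)))
```

The computation is trivial once the three coordinates are known; the difficulty for participants lies in reading those coordinates off the display.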

METHOD 

Participants 

The  participants  were  24  Navy  and  civilian  personnel  employed  at  the  Fleet  Anti-Submarine 
Warfare  Training  Center,  San  Diego,  California.  This  experiment  was  conducted  approximately  2 
months after the completion of Experiment 1. About half of these participants had also taken part in
Experiment 1.


Stimuli 

For  the  relative  position  tasks,  the  stimuli  were  the  same  10  simple  block  shapes  used  in 
Experiment  1  with  the  addition  of  a  ball  displayed  somewhere  above  the  block.  The  diameter  of  the 
ball  was  identical  to  the  length  of  one  of  the  cubes  making  up  the  block.  The  ball  was  located  in 
empty  space  from  one  to  three  diameter-lengths  above  the  block.  The  ball  did  not  cast  a  shadow. 

For  the  3-D  condition,  two  views  of  the  block  and  ball  were  rendered.  For  each  view,  the  camera 
was  positioned  at  30  degrees  above  the  horizontal  plane  of  the  block  and  at  45  degrees  to  the  left  or 
right  of  the  front  view.  Thus,  for  a  single  view,  the  location  of  the  ball  was  ambiguous  because  the 
cubes  underneath  the  ball  line  up  along  a  diagonal  of  the  block  shape.  The  ball  appears  to  be  floating 
over  each  of  the  cubes  along  the  diagonal.  One  cube  was  colored  red  for  use  in  the  Navigation  task 
described below. For the 2-D condition, a top view, front view, and right side view of the block and
ball  were  rendered. 
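
The single-view ambiguity can be illustrated numerically. The Python sketch below projects points orthographically for a camera 30 degrees above the horizontal and 45 degrees to one side of the front view, then shows that a second ball position, one cube further along the diagonal and somewhat higher, lands on the same screen point. The axis convention (x = east, y = north, z = up, front view seen from the south), the function names, and the ball coordinates are assumptions for this example, not the study's rendering setup.

```python
import math


def view_axes(azimuth_deg=45.0, elevation_deg=30.0):
    """Return (right, up, toward_camera) unit vectors for an orthographic view.

    Assumed convention: azimuth 0 is the front view (camera south of the
    block); positive azimuth swings the camera toward the east.
    """
    az = math.radians(azimuth_deg)
    el = math.radians(elevation_deg)
    right = (math.cos(az), math.sin(az), 0.0)
    up = (-math.sin(el) * math.sin(az), math.sin(el) * math.cos(az), math.cos(el))
    toward_camera = (math.cos(el) * math.sin(az), -math.cos(el) * math.cos(az), math.sin(el))
    return right, up, toward_camera


def screen_xy(point, azimuth_deg=45.0, elevation_deg=30.0):
    """Orthographic screen coordinates of a 3-D point."""
    right, up, _ = view_axes(azimuth_deg, elevation_deg)
    sx = sum(p * r for p, r in zip(point, right))
    sy = sum(p * u for p, u in zip(point, up))
    return sx, sy


ball = (1.0, 1.0, 3.0)                 # hypothetical ball position
_, _, cam = view_axes()
step = 1.0 / cam[0]                    # slide along the line of sight by one cube in x
alias = tuple(b + step * c for b, c in zip(ball, cam))

print("ball ", ball, "->", screen_xy(ball))
print("alias", alias, "->", screen_xy(alias))
```

Under these assumptions the second position is one cube over along the diagonal and about 0.8 cube higher, yet it is indistinguishable from the first in this view. A second view from the other side breaks the tie because its line of sight runs along the other diagonal.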

Procedure 

On  each  trial,  participants  saw  a  block  and  ball  configuration  in  either  2-D  or  3-D.  They  first 
performed the Over-Same task by using the mouse to click on the cube directly underneath the ball.
In the 2-D version, participants were informed that they could click on cubes in any of the three
views,  but  that  using  the  top  view  would  be  easiest.  In  the  3-D  version,  participants  saw  two  views  of 
the  same  block  and  ball  configuration  and  were  instructed  to  click  on  a  cube  in  the  block  shape 
rendered  on  the  right  side  of  the  screen.  Immediate  feedback  was  provided  and  the  participant  could 
not  continue  until  the  correct  cube  was  identified.  Total  time  to  find  the  correct  cube  and  first  choice 
errors  were  recorded. 

On  each  trial,  immediately  after  answering  in  the  Over-Same  task,  participants  performed  the 
Navigation  task  for  that  stimulus.  Following  a  correct  answer  in  the  Over-Same  task,  a  compass  point 
icon  appeared  next  to  the  block  and  direction  menus  appeared  below  the  block.  There  were  three 
separate  direction  menus:  North/South,  East/West,  and  Up/Down.  When  participants  selected  a 
direction menu, a pop-up list of distances appeared on the screen. For example, when the North/South
menu was selected, the list showed North 4, North 3, North 2, North 1, 0, South 1 ... South 4. After
participants  chose  a  distance,  the  menu  would  close  and  display  the  chosen  distance  for  that 
dimension.  After  choosing  distances  along  all  three  dimensions,  participants  clicked  on  a  “submit” 
button.  Their  answer  was  evaluated  and,  if  correct,  the  participant  continued  to  the  next  trial.  After 
three  incorrect  tries,  the  program  moved  on  to  the  next  trial  and  recorded  a  failure.  Total  time  to 
select  a  correct  answer  was  recorded.  Additionally,  we  recorded  first-try  errors. 

All  participants  received  six  practice  trials  on  the  Over-Same/Navigation  tasks  followed  by  10 
experimental  trials.  Half  of  the  participants  received  the  2-D  condition  followed  by  the  3-D 
condition,  and  half  of  the  participants  received  the  reverse  order.  Next,  half  the  participants  were 
shown the 2-D condition of the Over-Different task and half were shown the 3-D condition of the
Over-Different  task.  Participants  did  not  receive  both  conditions  of  the  Over-Different  task  in  an 
effort  to  keep  participants  from  becoming  too  familiar  with  the  stimuli. 

RESULTS 

Figure  8  shows  mean  response  times  and  percent  correct  scores.  For  the  Over-Same  task  (n  =  24), 
participants were faster and more accurate using the 2-D views than the 3-D views (RT: t(23) = 7.7,
p < .0001; PC: t(23) = 6.0, p < .0001). It is difficult to determine the relative position of two objects
using  the  3-D  views.  Any  one  view  is  ambiguous  and  two  views  of  the  block  and  ball  must  be 
compared  to  resolve  relative  position  ambiguities.  The  2-D  views  were  found  to  be  much  more 
effective,  but  as  previously  mentioned,  the  top  view  of  the  block  shape  in  the  2-D  condition  made  the 
task  very  easy. 

The  Over-Different  task  was  designed  to  remedy  this  complaint  by  requiring  participants  to 
respond  by  pointing  to  cubes  in  a  separate  3-D  view.  Nonetheless,  participants  were  still  faster, 
though not reliably more accurate, using the 2-D rather than the 3-D display (RT: t(22) = 2.9,
p < .008; PC: t(22) = 1.3, p > .05). The smaller effect size for this task compared to the Over-Same task
was  due,  at  least  in  part,  to  this  task  using  a  between-participants  design. 

Figure 8. Mean response times (in seconds) and percent correct scores for relative position tasks.


In the Navigation task, participants again performed faster and more accurately using the 2-D views than
the 3-D views (RT: t(23) = 7.3, p < .0001; PC: t(23) = 7.1, p < .0001). What is interesting about the
results  from  the  navigation  task  is  that  the  difficulties  associated  with  the  3-D  views  are  likely 
because  of  foreshortening  distortions  in  the  3-D  display  rather  than  the  ambiguity.  Since  the 
Navigation  task  directly  followed  the  Over-Same  task  on  each  trial,  participants  had  already  resolved 
the  ambiguity  about  the  location  of  the  ball.  However,  the  foreshortening  in  the  3-D  display  distorts 
angles  and  distances,  making  them  difficult  to  judge.  Participants  reported  that  height  was  especially 
difficult  to  judge  because  it  required  estimating  distances  across  empty  space. 

DISCUSSION 

How  well  can  we  discern  the  relative  positions  of  multiple  objects  in  the  3-D  environment?  For  the 
Over  tasks  and  the  Navigation  task,  the  2-D  display  was  superior  to  the  3-D  display.  For  the  Over 
tasks,  the  culprit  was  ambiguity  in  the  3-D  perspective  view.  It  is  impossible  to  determine  which 
cube  lies  directly  underneath  the  ball  in  a  single  view.  A  second  view  from  another  angle,  which  is 
equally  ambiguous,  must  be  added,  and  then  the  two  views  must  be  compared  to  find  the  correct 
location.  The  2-D  views  from  the  front  and  side  are  also  ambiguous.  However,  the  top  view  is 
entirely  unambiguous  for  determining  which  cube  is  underneath  the  ball.  In  fact,  the  top  2-D  view 
makes  the  task  trivial:  Find  the  ball  and  you  have  found  the  cube,  too. 

For  the  Navigation  task,  the  culprit  for  the  3-D  views  is  not  the  line-of-sight  ambiguity,  because 
the  ambiguity  is  resolved  in  the  Over-Same  task.  Instead,  the  culprit  seems  to  be  the  distortion  in  the 
3-D  perspective  view  caused  by  foreshortening,  which  distorts  the  angles  and  distances  between 
objects.  In  the  2-D  views,  there  is  no  distortion  of  angles  or  distances.  Instead,  each  dimension  is 
presented  faithfully.  Consequently,  it  is  easy  to  judge  and  move  specific  directions  and  distances 
along  each  dimension.  To  move  in  the  dimension  that  is  not  represented,  one  simply  has  to  turn  to 
another  view  where  that  dimension  is  represented  faithfully.  Of  course,  to  view  another  dimension 
requires  a  shift  of  the  eyes  and  a  re-orientation  to  the  object  or  scene,  but  this  perceptual  shift  does 
not  seem  to  hinder  performance  as  much  as  dealing  with  the  distortions  in  the  3-D  views. 

GENERAL  DISCUSSION 


To  summarize  our  findings,  a  single  3-D  perspective  view  was  far  superior  to  three  2-D  views  for 
understanding  the  shape  of  the  simple  blocks  used  in  Experiment  1.  However,  the  2-D  views  were  far 
superior  to  two  3-D  views  for  understanding  the  relative  positions  of  two  objects.  We  believe  these 
results  have  profound  implications  for  the  design  of  visualization  software  from  maps  and  geoplots  to 
structural  illustrations.  The  choice  of  2-D  or  3-D  views,  therefore,  depends  on  the  relative  advantages 
and  disadvantages  of  3-D  displays  for  conveying  different  types  of  information  and  which  types  of 
information  a  task  requires. 

There  are  three  main  advantages  and  two  main  disadvantages  of  using  3-D  perspective  views.  The 
advantages  are  that  3-D  perspective  views  (1)  integrate  all  three  dimensions  into  a  single  rendering, 
(2)  can  be  enhanced  with  supplementary  depth  cues  (e.g.,  shadows,  shading),  and  (3)  allow  features  of 
an  object  to  be  depicted  that  would  be  invisible  in  a  normal  2-D  view.  The  integration  of  all  three 
dimensions  into  a  single  rendering  is  very  useful  for  understanding  shape.  With  2-D  views,  no  one 
view  can  provide  information  about  all  three  dimensions  of  an  object.  To  present  a  third  dimension,  a 
separate  view  must  be  added.  For  example,  one  view  might  show  length  and  width  but  no  depth. 
Another  view  would  have  to  be  added  to  show  depth  (and  either  width  or  length).  Information  about 
an  object  or  scene  must  then  be  combined  mentally,  which  is  both  difficult  and  time-consuming.  A 
perspective  view  is  easier  to  use  because  it  integrates  the  dimensions  in  the  view  itself. 

The  second  advantage  of  3-D  views  is  that  extra  depth  cues,  such  as  shadows,  object  scaling  (i.e., 
distant  object  features  are  drawn  smaller),  and  shading,  can  be  added.  Applied  to  a  3-D  wire  frame 
drawing,  they  make  the  3-D  shape  of  the  object  immediately  apparent.  Depth  cues  are  difficult  to  add 
to  2-D  views. 

The  third  advantage  of  3-D  views  is  that  they  allow  for  the  illustration  of  object  features  that 
would  be  hidden  in  a  normalized  2-D  view.  Pockets  and  holes  can  be  rendered  with  depth  cues  that 
would  otherwise  appear  flat  in  one  normalized  view  and  invisible  in  others.  In  figure  9,  the  large 
pocket  on  top  is  more  difficult  to  understand  in  2-D  than  in  3-D  and  the  small  pocket  in  the  notch  is 
impossible  to  view  from  any  normal  2-D  angle.  Further,  the  slanted  cutout  is  indistinguishable  from 
the  notched  cutout. 

The  two  disadvantages  of  3-D  perspective  views  are  the  line-of-sight  ambiguities  and  geometric 
distortions  caused  by  foreshortening.  Both  of  these  disadvantages  are  exacerbated  when  the  depicted 
scene  is  composed  of  small  objects  separated  by  empty  space  because  there  are  few  depth  cues  that 
can  be  used  to  compensate  for  the  distortions.  Because  perspective  views  are  oblique  renderings, 
parallel  lines  are  represented  as  converging  lines.  The  perspective  angle  of  the  scene  will  affect  our 
judgment  of  object  shapes,  slants,  and  distances.  Right  angles  appear  either  acute  or  obtuse,  and 
rectangles  appear  as  trapezoids.  As  sides  of  an  object  approach  the  line  of  sight  of  the  rendering, 
lengths  shorten  until  they  become  entirely  invisible,  leading  from  distortion  to  complete  ambiguity. 
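
The distortions described above can be reproduced with a minimal pinhole projection. The following Python sketch is an illustration only: the eye position, screen distance `d`, and the square's coordinates are assumptions chosen for simplicity, not the rendering parameters used in the experiments. It projects a square lying on a ground plane and measures the result.

```python
import math

def project(point, d=1.0):
    """Pinhole perspective projection onto a screen at distance d.
    The eye sits at the origin and looks along +z (assumed setup)."""
    x, y, z = point
    return (d * x / z, d * y / z)

def angle_deg(vertex, a, b):
    """Angle at `vertex` between the directions toward a and b, in degrees."""
    v1 = (a[0] - vertex[0], a[1] - vertex[1])
    v2 = (b[0] - vertex[0], b[1] - vertex[1])
    dot = v1[0] * v2[0] + v1[1] * v2[1]
    return math.degrees(math.acos(dot / (math.hypot(*v1) * math.hypot(*v2))))

# A 2 x 2 square lying on a ground plane one unit below the eye.
near_left, near_right = (-1, -1, 2), (1, -1, 2)
far_left, far_right = (-1, -1, 4), (1, -1, 4)
nl, nr, fl, fr = map(project, (near_left, near_right, far_left, far_right))

near_width = nr[0] - nl[0]           # width of the near edge on screen
far_width = fr[0] - fl[0]            # width of the far edge on screen
near_corner = angle_deg(nl, nr, fl)  # a true right angle, as drawn on screen
far_corner = angle_deg(fl, fr, nl)   # another true right angle, as drawn

print(near_width, far_width, near_corner, far_corner)
```

With these assumed values the far edge is drawn at half the width of the near edge, so the square appears as a trapezoid, and its right angles are drawn as one acute and one obtuse angle.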

Directly  along  the  line  of  sight,  it  is  impossible  to  tell  where  an  object  resides.  In  a  2-D  display, 
this  line-of-sight  ambiguity  corresponds  with  the  missing  dimension.  In  a  3-D  display,  the 
line-of-sight  ambiguity  is  spread  across  all  three  dimensions:  it  lies  along  a  vector  in  the  scene  that  begins  at  a  point 
on  the  screen’s  surface  and  extends  toward  the  vanishing  point.  We  believe  that  confining  the 
ambiguity  to  a  single  dimension,  while  faithfully  representing  the  other  two  dimensions,  as  in  a  2-D 
display,  is  easier  to  think  about  and  deal  with  than  spreading  the  ambiguity  across  all  three 
dimensions  and  representing  none  faithfully. 
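
The contrast between the two kinds of ambiguity can be made concrete with a short Python sketch. It assumes a simple pinhole model with the eye at the origin and y as the vertical axis; the function names and coordinates are hypothetical. Points that collapse onto one screen location in the perspective view differ in all three coordinates, whereas points that collapse in an orthographic top view differ only in height.

```python
def perspective(point, d=1.0):
    """Pinhole projection; eye at the origin looking along +z (assumed setup)."""
    x, y, z = point
    return (d * x / z, d * y / z)

def top_view(point):
    """Orthographic top view: drop the vertical coordinate y."""
    x, _y, z = point
    return (x, z)

def varying_axes(points):
    """Indices of the coordinates (0=x, 1=y, 2=z) that differ among the points."""
    return {axis for axis in range(3) if len({p[axis] for p in points}) > 1}

# Three points on one line of sight through the eye.
ray_points = [(1.0, -1.0, 2.0), (2.0, -2.0, 4.0), (3.0, -3.0, 6.0)]
# Three points stacked vertically above one ground location.
column_points = [(1.0, -1.0, 2.0), (1.0, 0.0, 2.0), (1.0, 3.0, 2.0)]

perspective_images = {perspective(p) for p in ray_points}
top_images = {top_view(p) for p in column_points}

print(perspective_images, varying_axes(ray_points))
print(top_images, varying_axes(column_points))
```

Each set of images contains a single screen position, so both views are ambiguous. The difference is in what remains unknown: three coordinates in the perspective case, and only the missing dimension in the top view.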

Figure  9.  Top  view,  front  view,  and  perspective  view  of  block  with  cutouts  and  pockets. 


The  represented  dimensions  in  the  2-D  views  are  not  distorted;  what  is  visible  is  accurately 
represented.  The  user  knows  that  scene  ambiguities  such  as  elevation  in  a  plan  view  can  be  resolved 
by  viewing  the  missing  third  dimension.  This  confinement  of  ambiguity  to  the  dimension  that  is  not 
represented  provides  better  opportunities  to  deal  with  the  ambiguity.  A  user  can  easily  switch  among 
a  set  of  2-D  views  to  obtain  undistorted  information  about  each  dimension  of  interest.  In  contrast, 
resolving  scene  ambiguities  using  multiple  3-D  views  requires  substantial  effort  as  demonstrated  in 
Experiment  2. 

Could  the  ambiguities  associated  with  3-D  perspective  views  be  reduced  by  adding 
natural  depth  cues  to  our  stimuli,  such  as  texture  gradients,  interposition,  atmospheric  perspective  (in 
which  the  scene  is  viewed  from  above  and  at  an  angle,  usually  45  degrees),  shading,  and  brightness 
(Schmidt,  1997)?  A  key  issue  is  how  well  the  available  depth  cues  can  be  used  in  a  scene.  For  a 
complex  object  or  scene,  many  of  these  cues  can  be  used  effectively  to  convey  depth.  However,  for  a 
set  of  objects  separated  by  empty  space,  such  as  aircraft  approaching  an  airport  or  air  corridors  and 
missile  routes  over  terrain,  few  depth  cues  can  be  used.  Shading,  texture  gradients,  and  object  scaling 
are  all  of  limited  value  because  they  cannot  be  drawn  on  empty  space,  yet  the  critical  relationships 
between  the  objects  are  defined  by  empty  space.  Since  natural  depth  cues  are  unavailable  in  these 
situations,  the  ambiguity  and  distortions  inherent  in  3-D  perspective  views  generally  cannot  be 
mitigated.  Consequently,  when  the  information  of  interest  is  the  relationships  among  distinct  objects, 
a  3-D  perspective  view  can  be  seriously  detrimental  to  accurate  perception. 

Finally,  3-D  views  may  be  less  useful  for  understanding  free-form  objects  such  as  terrain.  For 
regular  objects,  it  is  relatively  easy  to  compensate  for  the  distortions  introduced  by  a  3-D  view.  For 
example,  the  stimuli  used  in  these  experiments  were  composed  of  equal  size  cubes  and  right  angles. 
The  relationships  among  features  of  these  objects  are  easy  to  understand  because  we  know  the  true 
angles  and  distances.  However,  with  free-form  objects,  we  are  generally  unable  to  compensate  for 
distortions  caused  by  3-D  views.  M.  C.  Escher  took  advantage  of  this  idea  in  many  of  his  drawings 
by  misapplying  depth  cues  to  create  impossible  figures.  For  example,  in  Ascending  and  Descending, 
Escher  misrepresented  angles  and  distances  to  create  an  infinite  staircase  that  wraps  into  itself. 

In  future  work,  we  plan  to  look  more  closely  at  the  question  of  understanding  the  shapes  of 
free-form  objects  such  as  mountain  terrain  or  even  blood  vessels.  Without  regular  distances  and  right  angles 
to  compensate  for  the  distortions  caused  by  foreshortening,  the  shapes  of  these  free-form  objects 
when  viewed  in  3-D  may  be  difficult  to  discern  with  any  precision.  We  speculate  that  there  still  may 
be  some  advantage  for  understanding  the  general  layout  of  such  objects  in  3-D,  but  that  this 
advantage  will  diminish  as  more  precise  judgments  are  required. 

Do  more  advanced  3-D  display  technologies,  such  as  stereoscopic  or  volumetric  displays, 
demonstrate  the  same  limitations  as  3-D  views  on  flat  screens?  Both  the  artificial  stereopsis  of 
polarized  lenses  and  the  natural  stereopsis  of  a  true  volumetric  display  add  depth  to  a  3-D  view.  In  a 
meaningful  sense,  objects  that  are  farther  away  are  really  farther  away.  Ambiguity  along  the  line  of 
sight  is  reduced  because  each  eye  is  offset  and  has  a  slightly  different  line  of  sight.  However, 
distortions  from  foreshortening  remain.  Therefore,  while  it  may  be  possible  to  use  stereoscopic 
displays  to  marginally  improve  a  rough  sense  of  relative  position,  precision  judgments  will  remain 
difficult.  Consequently,  tasks  that  require  locating  the  positions  of  many  objects  in  space,  such  as  air 
traffic  control  or  air  warfare,  will  always  benefit  from  the  accurate  representations  found  in  2-D 
views. 
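
Why a second, offset line of sight reduces the ambiguity can be illustrated with a small Python sketch. It assumes two pinhole eyes separated horizontally by `EYE_SEPARATION` and a screen at unit distance; these values and names are illustrative assumptions, not a model of any particular stereoscopic or volumetric display. Two points on the same line of sight for the left eye fall on different positions for the right eye.

```python
def project_from(eye_x, point, d=1.0):
    """Horizontal screen coordinate of `point` for an eye at (eye_x, 0, 0)."""
    x, _y, z = point
    return d * (x - eye_x) / z

EYE_SEPARATION = 0.065  # assumed interocular distance, illustrative only
left_eye, right_eye = -EYE_SEPARATION / 2, EYE_SEPARATION / 2

def on_left_ray(t, direction=(0.2, 0.0, 1.0)):
    """Point at parameter t along a line of sight leaving the left eye."""
    return (left_eye + t * direction[0], t * direction[1], t * direction[2])

near, far = on_left_ray(1.0), on_left_ray(2.0)

left_images = (project_from(left_eye, near), project_from(left_eye, far))
right_images = (project_from(right_eye, near), project_from(right_eye, far))

# Disparity: difference between the left-eye and right-eye image positions.
disparities = tuple(l - r for l, r in zip(left_images, right_images))

print(left_images, right_images, disparities)
```

In this model the disparity shrinks as depth increases, so the two points become distinguishable, yet each eye's image is still an ordinary perspective projection with the same foreshortening as before.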

AN  OPERATIONAL  CONCEPT:  ORIENT  AND  OPERATE 

Our  findings  imply  that  combining  both  2-D  and  3-D  views  may  prove  optimal  for  use  in 
operational  military  settings.  The  display  interface  would  embrace  a  concept  that  we  call  “Orient  and 
Operate.”  Users  orient  to  the  layout  of  a  scene  using  a  3-D  view,  but  then  switch  to  2-D  views  to 
interact  with  and  operate  on  the  scene.  For  example,  in  Land  Attack  Warfare,  missiles  are 
programmed  to  fly  routes  from  launch  platforms  located  off  the  coast  to  targets  on  land.  These 
missiles,  such  as  Tomahawks,  are  non-ballistic  and  have  the  capability  to  follow  the  terrain  up,  down, 
and  around.  In  addition  to  following  the  terrain,  the  missile  routes  can  be  set  to  avoid  anti-missile 
radar  envelopes  and  other  known  obstacles,  including  designated  air  sortie  corridors.  Users  often 
need  to  review  and  evaluate  these  routes  to  check  for  potential  conflicts  with  new  or  prospective 
obstructions. 

A  3-D  view  may  work  best  to  gain  a  basic  grasp  of  the  terrain  and  of  the  shapes  and  locations  of  missile 
routes  and  obstacles  (see  figure  10).  However,  this  display  is  too  ambiguous  and  distorted  for  precise 
judgments.  For  example,  in  the  3-D  view,  it  cannot  be  determined  whether  the  yellow  air  corridor  is 
next  to  or  above  the  missile  route  shown  in  black.  Also,  the  ball  could  be  resting  on 
the  plain  or  floating  over  the  mountains,  and  the  circular  orange  radar  envelope  appears  oval. 

Once  a  rough  sense  of  the  layout  and  the  shapes  of  routes  is  obtained,  a  2-D  view  may  work  best 
for  achieving  a  precise  grasp  of  relative  positions  and  exact  shapes.  As  another  example,  a  user  may 
have  difficulty  using  a  3-D  perspective  view  to  effectively  change  a  missile  route  to  avoid  a  new 
radar  envelope,  but  may  find  the  same  task  easy  using  the  2-D  view  shown  in  figure  10.  There  are 
numerous  design  options  for  implementing  the  Orient  and  Operate  concept,  such  as  presenting  both  3-D 
and  2-D  views  simultaneously  or  allowing  the  user  to  select  the  viewing  angles.  The  goal  of  our 
future  work  is  to  find  the  best  interface  design. 
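
One reason the route-editing task is easier in a 2-D plan view is that distances there are undistorted, so a conflict check reduces to plain planar geometry. The Python sketch below illustrates this under assumed data: the waypoints, the circular radar envelope, and the function names are hypothetical and do not describe any actual mission-planning system. Altitude is ignored here and would have to be checked in a separate side view.

```python
import math

def distance_to_segment(p, a, b):
    """Shortest plan-view distance from point p to the segment from a to b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    length_sq = dx * dx + dy * dy
    if length_sq == 0:
        return math.hypot(px - ax, py - ay)
    # Parameter of the closest point on the segment, clamped to [0, 1].
    t = max(0.0, min(1.0, ((px - ax) * dx + (py - ay) * dy) / length_sq))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))

def conflicting_legs(route, center, radius):
    """Indices of route legs that pass inside a circular envelope."""
    return [i for i in range(len(route) - 1)
            if distance_to_segment(center, route[i], route[i + 1]) < radius]

# Hypothetical plan-view waypoints and a circular radar envelope.
route = [(0, 0), (4, 0), (8, 3), (12, 3)]
radar_center, radar_radius = (6, 0), 1.5

print(conflicting_legs(route, radar_center, radar_radius))
```

A user, or software supporting the user, could then adjust only the flagged leg, which is the kind of precise operation the Operate step is meant to support.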

In  summary,  we  have  found  previous  research  to  be  confusing  regarding  potential  benefits  and 
limitations  of  3-D  displays.  Our  strategy  has  been  to  step  back  from  more  applied  studies  to  consider 
the  fundamental  capabilities  and  limitations  of  2-D  and  3-D  views,  and  ask  what  tasks  best  fit  those 
capabilities.  The  compelling  nature  of  3-D  views  seems  to  reside  in  their  integration  of  all  three 
dimensions  into  a  single  view  and  their  “natural”  representation  of  space.  Yet  this  natural 
representation  is  fraught  with  ambiguity  and  distortion.  Using  renderings  of  simple  blocks,  we  found  that  each 
display  can  be  useful  (3-D  for  understanding  shape  and  2-D  for  understanding  relative  position). 
Finally,  we  recommend  the  Orient  and  Operate  display  design  concept,  which  maximizes  the  benefits 
of  each  type  of  display  (3-D  for  orientation  to  a  scene  and  2-D  for  precision  and  relational 
judgments).  In  future  work,  we  plan  to  refine  this  concept  by  exploring  ways  to  optimize 
2-D  and  3-D  views  using  natural  and  artificial  depth  cues. 